
    Vision-based Detection of Acoustic Timed Events: a Case Study on Clarinet Note Onsets

    Acoustic events often have a visual counterpart. Knowledge of visual information can aid the understanding of complex auditory scenes, even when only a stereo mixdown is available in the audio domain, e.g., identifying which musicians are playing in large musical ensembles. In this paper, we consider a vision-based approach to note onset detection. As a case study we focus on challenging, real-world clarinetist videos and carry out preliminary experiments on a 3D convolutional neural network based on multiple streams and purposely avoiding temporal pooling. We release an audiovisual dataset with 4.5 hours of clarinetist videos together with cleaned annotations which include about 36,000 onsets and the coordinates for a number of salient points and regions of interest. By performing several training trials on our dataset, we learned that the problem is challenging. We found that the CNN model is highly sensitive to the optimization algorithm and hyper-parameters, and that treating the problem as binary classification may prevent the joint optimization of precision and recall. To encourage further research, we publicly share our dataset, annotations and all models and detail which issues we came across during our preliminary experiments. Comment: Proceedings of the First International Conference on Deep Learning and Music, Anchorage, US, May, 2017 (arXiv:1706.08675v1 [cs.NE])

    The Neecham Confusion Scale and the Delirium Observation Screening Scale: Capacity to discriminate and ease of use in clinical practice

    BACKGROUND: Delirium is a frequent form of psychopathology in elderly hospitalized patients; it is a symptom of acute somatic illness. The consequences of delirium include high morbidity and mortality, lengthened hospital stay, and nursing home placement. Early recognition of delirium symptoms enables the underlying cause to be diagnosed and treated and can prevent negative outcomes. The aim of this study was to determine which of the two delirium observation screening scales, the NEECHAM Confusion Scale or the Delirium Observation Screening (DOS) scale, has the best discriminative capacity for diagnosing delirium and which is more practical for daily use by nurses. METHODS: The project was conducted on four wards of a university hospital; 87 patients were included. During 3 shifts, these patients were observed for symptoms of delirium, which were rated on both scales. A DSM-IV diagnosis of delirium was made or rejected by a geriatrician. Nurses were asked to rate the practical value of both scales using a structured questionnaire. RESULTS: Both the DOS and the NEECHAM showed high sensitivity (0.89 – 1.00) and specificity (0.86 – 0.88). Nurses rated the practical use of the DOS scale as significantly easier than the NEECHAM. CONCLUSION: Successful implementation of standardized observation depends largely on the consent of professionals and their acceptance of a scale. In our hospital, we therefore chose to involve nurses in the choice between two instruments. During the study they were able to experience both scales and give their opinion on ease of use. In the final decision on the instrument we found that both scales were very acceptable in terms of sensitivity and specificity, so the opinion of the nurses was decisive. They were positive about both instruments; however, they rated the DOS scale as significantly easier to use and relevant to their practice. Our findings were obtained from a single site study with a small sample, so a large comparative trial to study the value of both scales further is recommended. On the basis of our experience during this study and findings from the literature with regard to the implementation of delirium guidelines, we will monitor the further implementation of the DOS Scale in our hospital with intensive consultation.

    Elektronische consultatie in de praktijk (Electronic consultation in practice)


    Single Shot Temporal Action Detection

    Temporal action detection is a very important yet challenging problem, since videos in real applications are usually long, untrimmed and contain multiple action instances. This problem requires not only recognizing action categories but also detecting the start time and end time of each action instance. Many state-of-the-art methods adopt the "detection by classification" framework: first generate proposals, and then classify them. The main drawback of this framework is that the boundaries of action instance proposals have been fixed during the classification step. To address this issue, we propose a novel Single Shot Action Detector (SSAD) network based on 1D temporal convolutional layers to skip the proposal generation step via directly detecting action instances in untrimmed video. In pursuit of an SSAD network that can work effectively for temporal action detection, we empirically search for the best network architecture of SSAD, because no existing models can be directly adopted. Moreover, we investigate input feature types and fusion strategies to further improve detection accuracy. We conduct extensive experiments on two challenging datasets: THUMOS 2014 and MEXaction2. With the Intersection-over-Union threshold set to 0.5 during evaluation, SSAD significantly outperforms other state-of-the-art systems by increasing mAP from 19.0% to 24.6% on THUMOS 2014 and from 7.4% to 11.0% on MEXaction2. Comment: ACM Multimedia 201